DISTANCE PROPERTIES OF EXPANDER CODES 



ALEXANDER BARG* AND GILLES ZEMOR 

ABSTRACT. We study the minimum distance of codes defined on bipartite graphs. The weight spectrum and the
minimum distance of a random ensemble of such codes are computed. It is shown that if the vertex codes have
minimum distance $\ge 3$, the overall code is asymptotically good, and sometimes meets the Gilbert-Varshamov
bound.

Constructive families of expander codes are presented whose minimum distance asymptotically exceeds the
product bound for all code rates between 0 and 1.

1. Introduction 

1.1. Context. The general idea of constructing codes on graphs first appeared in Tanner's classical work 
[19]. One of the methods put forward in this paper was to associate message bits with the edges of a graph 
and use a short linear code as a local constraint on the neighboring edges of each vertex. M. Sipser and D. 
Spielman [16] generated renewed interest in this idea by tying spectral properties of the graph to decoding 
analysis of the associated code: they suggested the term expander codes for code families whose analysis 
relies on graph expansion. Further studies of expander codes include [23], [11], [5], [4], [3], [17].

While [19] and [16] did not especially favor the choice of an underlying bipartite graph, subsequent
papers, starting with [23], made heavy use of this additional feature. In retrospect, codes on bipartite graphs
can be viewed as a natural generalization of R. Gallager's low-density parity-check codes. Another view 
of bipartite-graph codes involves the so-called parallel concatenation of codes which refers to the fact 
that message bits enter two or more unrelated sets of parity-check equations that correspond to the local 
constraints. This view ties bipartite-graph codes to turbo codes and related code families; the bipartite graph 
can be defined by a permutation of message symbols which is very close to the "interleaver" of the turbo 
coding schemes. 

A more traditional method of code concatenation, dating back to the classical works of P. Elias and 
G. D. Forney, suggests encoding the message by several codes successively, earning this class of
constructions the name serial concatenation. A well-known set of results on constructions, parameters and
decoding performance of serial concatenations includes Forney's bound on the error exponent attainable
under a polynomial-time decoding algorithm [9], implying in particular the existence of a constructive
capacity-achieving code family, and the Zyablov bound on the relative distance attainable under the
condition of polynomial-time constructibility [24]. Initial results of this type for expander codes [16], [23], [5]
were substantially weaker than both the Forney and Zyablov bounds, but additional ideas employed both in
code construction and decoding led to establishing these results for the class of expander codes [3], [11], [17].
In particular, paper [3] focused on similarities and differences between serial concatenations and
bipartite-graph codes viewed as parallel concatenations. We refer to this paper for a detailed introduction to
properties of both code families. Paper [3] also suggested a decoding algorithm that corrects a fraction of errors
approaching half the designed distance, i.e., half the Zyablov bound. The error exponent of this algorithm
reaches the Forney bound for serial concatenations. The advantage of bipartite-graph codes over the latter
is that their decoding complexity is an order of magnitude lower (proportional to the block length
$N$ as opposed to $N^2$ for serial concatenations).

The main goal of that work was to catch up with the classical achievements of serial concatenation and show that
they can be reproduced by parallel schemes, with the added value of lower-complexity decoding. One of
the motivations for the present paper is to exhibit new achievements of parallel concatenation, unrelated to
decoding, that surpass the present-day performance of all codes constructed in the framework of the classical
serial approach.

1.2. Bounding the minimum distance of expander codes. The main focus of this paper is the parameters
$[N, RN, \delta N]$ of bipartite-graph codes, particularly the asymptotic behavior of the relative minimum distance
$\delta$ as a function of the rate $R$. Bipartite-graph codes, and more generally codes defined on graphs, are famous
for their low-complexity decoding and its performance under high noise, but are generally considered to
have poorer minimum distances than their algebraic counterparts. We strive here to reverse this trend and
show that it is possible to design codes defined on graphs with very respectable $R$ versus $\delta$ tradeoffs.

In the first half of this paper we study the average weight distribution of the random ensemble of
bipartite-graph codes (Section 3). Under the assumption that the minimum distance of the small constituent codes is
at least 3, we show that the ensemble contains codes which are asymptotically good for all code rates, and
for some values of the rate reach the Gilbert-Varshamov (GV) bound. This result shows interesting parallels
with a similar theorem for serial concatenations in Forney's sense. It also generalizes the result of
[8], [13], where bipartite-graph codes with component Hamming codes were shown to be asymptotically good.

In the second part of the paper we turn to constructive issues. Until now the product of the relative
distances of the constituent codes was the standard lower bound on the relative minimum distance of expander
codes, as it is for the class of Forney's serially concatenated codes, including product codes. Efforts have
been made to surpass this product bound, or designed distance, for short block lengths, see e.g. [20], but no
asymptotic improvements have been obtained for any of these classes. In Section 4 we describe two families
of bipartite-graph codes that asymptotically surpass the product bound on the minimum distance. In
particular we obtain a polynomially constructible family of binary codes that for any rate between 0 and 1 have
relative distance greater than the Zyablov bound [24]. These constructions are based on allowing both binary
and nonbinary local codes in the expander code construction and matching the restrictions imposed by them
on the binary weight of the edges in the graph. This result confirms the intuition, supported by examples of
short codes and ensemble-average results, that the product bound is a poor estimate of the true distance
of two-level code constructions, be they parallel or serial concatenations. Even though it does not match the
distance of such code families as multilevel concatenations or serial concatenations with algebraic-geometry
outer codes, this result is still the first of its kind because all the other constructions rely on the product bound
for estimating the designed distance. In particular, the results of Section 4 improve over the parameters of all
previously known polynomial-time constructions of expander codes and of concatenations of two codes not
involving algebraic-geometry codes, including the constructions of Forney [9, 24], Alon et al. [1], Sipser and
Spielman [16], Guruswami and Indyk [11], the authors [3], and Bilu and Hoory [6]. In the final Section 5
we compare construction complexity with other code families whose parameters are comparable to those of
the bipartite-graph codes constructed in this paper.

2. Preliminaries 

2.1. Bipartite-graph codes: Basic construction. Let $G = (V, E)$ be a balanced, $\Delta$-regular bipartite graph
with the vertex set $V = V_0 \cup V_1$, $|V_0| = |V_1| = n$. The number of edges is $|E| = N = \Delta n$.

Let us choose an arbitrary ordering of edges of the graph which will be fixed throughout the construction.
For a given vertex $v \in G$ this defines an ordering of edges $v(1), v(2), \dots, v(\Delta)$ incident to it. We denote
this subset of edges by $E(v)$. For a vertex $v$ in one part of $G$ the set of vertices in the other part adjacent to
$v$ will also be called the neighborhood of the vertex $v$, denoted $N(v)$.

Let $A[\Delta, R_0\Delta]$, $B[\Delta, R_1\Delta]$ be binary linear codes. The binary bipartite-graph code $C(G; A, B)$ has
parameters $[N, RN]$. We assume that the coordinates of $C$ are in one-to-one correspondence with the edges
of $G$. Let $x \in \{0, 1\}^N$. By $x_v$ we denote the projection of $x$ on the edges incident to $v$. By definition, $x$ is
a codevector of $C$ if

(1) for every $v \in V_0$, the vector $x_v$ is a codeword of $A$;

(2) for every $w \in V_1$, the vector $x_w$ is a codeword of $B$.

This construction and its generalizations are primarily studied in the asymptotic context when $n \to \infty$, $\Delta =$
const. Paper [16] shows that, for a suitable choice of the code $A$, codes $C(G; A, A)$ are asymptotically good
and correct a number of errors that grows linearly with $n$ under a linear-time decoding algorithm. Another
decoding algorithm, which gives a better estimate of the number of correctable errors, was suggested in
[23]. Paper [5] shows that introducing two different codes $A$ and $B$ enables one to prove that the codes
$C(G; A, B)$ attain capacity of the binary symmetric channel. Note that taking $A$ the parity-check code and
$B$ the repetition code we obtain Gallager's LDPC codes. For this reason codes $C(G; A, B)$ are sometimes
called generalized low-density codes [8], [13].
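The two defining conditions of $C(G; A, B)$ can be checked mechanically on a toy graph. The following sketch is an illustration only: the helper names `in_code` and `is_codeword`, the edge-list representation, and the choice of the complete bipartite graph $K_{3,3}$ with single parity-check vertex codes are assumptions made for the example and do not come from the paper.

```python
from itertools import product

def in_code(word, parity_checks):
    """True if `word` satisfies every parity check (row) over GF(2)."""
    return all(sum(h * x for h, x in zip(row, word)) % 2 == 0
               for row in parity_checks)

def is_codeword(x, edges, n, H_A, H_B):
    """Check conditions (1) and (2) for a vector x indexed by the edge list.

    edges[k] = (v, w) joins left vertex v in V_0 to right vertex w in V_1;
    the fixed edge ordering induces the ordering of E(v) at every vertex.
    """
    for v in range(n):
        x_v = [x[k] for k, (a, _) in enumerate(edges) if a == v]
        if not in_code(x_v, H_A):
            return False
    for w in range(n):
        x_w = [x[k] for k, (_, b) in enumerate(edges) if b == w]
        if not in_code(x_w, H_B):
            return False
    return True

# Toy instance (hypothetical): K_{3,3}, so n = 3, Delta = 3, N = 9, and both
# vertex codes are the [3, 2] single parity-check code (R_0 = R_1 = 2/3).
n = 3
edges = [(v, w) for v in range(n) for w in range(n)]
H_spc = [[1, 1, 1]]
codewords = [x for x in product((0, 1), repeat=len(edges))
             if is_codeword(x, edges, n, H_spc, H_spc)]
dimension = len(codewords).bit_length() - 1   # log2 of the code size
```

Exhaustive enumeration is only feasible for such tiny parameters; it is meant to make the definition concrete, not to suggest a practical construction.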

Before turning to parameters of the code $C$ let us recall some properties of the graph $G$. Let $\lambda$ be the
second largest eigenvalue (of the adjacency matrix) of $G$. For a vertex $v \in V_0$ and a subset $T \subset V_1$ let
$\deg_T(v)$ be the number of edges that connect $v$ to vertices in $T$. A key tool for the analysis of the code $C$ is
given by the following lemma.

Lemma 1. Let $S \subset V_0$, $T \subset V_1$. Suppose that

$\forall_{v \in S}\ \deg_T(v) \ge \alpha_0 \Delta, \qquad \forall_{w \in T}\ \deg_S(w) \ge \alpha_1 \Delta,$

where $\alpha_0, \alpha_1 \in (0, 1)$. Then

From this, the relative distance of $C$ satisfies

where $d_0 = \Delta\delta_0$, $d_1 = \Delta\delta_1$ are the distances of the codes $A$ and $B$.
The rate of the code $C$ is easily estimated to be

(2) $R \ge R_0 + R_1 - 1$.
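The rate estimate (2) is a count of parity checks: the $n$ vertices of $V_0$ impose at most $n\Delta(1 - R_0)$ independent checks and those of $V_1$ at most $n\Delta(1 - R_1)$, out of $N = n\Delta$ coordinates. A minimal sketch of this arithmetic follows; the function name `designed_rate` is an assumption for the example, and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction

def designed_rate(R0, R1):
    """Lower bound (2) on the rate of C(G; A, B).

    At most N(1 - R0) + N(1 - R1) independent parity checks are imposed
    on N coordinates, so the dimension is at least N(R0 + R1 - 1).
    """
    return Fraction(R0) + Fraction(R1) - 1

# [7, 4] Hamming codes at both sides: R >= 2 * 4/7 - 1
hamming = designed_rate(Fraction(4, 7), Fraction(4, 7))
# [23, 12] Golay codes at both sides: R >= 2 * 12/23 - 1
golay = designed_rate(Fraction(12, 23), Fraction(12, 23))
```

The bound is not always tight, since some of the checks may be linearly dependent.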

We will assume that the second eigenvalue $\lambda$ of the graph $G_1 = (V_0 \cup V_1, E_1)$ is small compared to its
degree $\Delta_1$. For instance, the graph $G_1$ can be chosen to be Ramanujan, i.e., $\lambda \le 2\sqrt{\Delta_1 - 1}$. Then from (1) we
see that the code $C$ approaches the product bound $\delta_0\delta_1$, which is a standard result for serial concatenations.

2.2. Multiple edges. In [5] this construction was generalized by allowing every edge to carry $t$ bits of the
codeword instead of just one bit, where $t$ is some constant. The code length then becomes $n\Delta t$. We again
denote this quantity by $N$ because it will always be clear from the context which of the two constructions
we consider. Let $A[t\Delta, R_0 t\Delta]$ be a binary linear code and $B[\Delta, R_1\Delta]$ be a $q$-ary additive code, $q = 2^t$. To
define the code $C(G; A, B)$ we keep condition (1) above and replace condition (2) with

(2') for every $w \in V_1$ the vector $x_w$, viewed as a $q$-ary vector, is a codeword in $B$.

An alternative view of this construction is allowing $t$ parallel edges to replace each edge in the original graph
$G$. Then every edge again corresponds to one bit of the codeword. An advantage of the view offered above
is that it allows a direct application of Lemma 1.

In [3] it was shown that this improves the parameters and performance estimates of the code $C$. For
instance, there exists an easily constructible code family $C(G; A, A)$ of rate $R$ with relative distance given
by

(3) $\delta \ge \frac{1}{2}(1 - R)\, h^{-1}\!\left(\frac{1 - R}{2}\right) - \varepsilon \quad (\varepsilon > 0)$,

where $h(\cdot)$ is the binary entropy function. Note that the distance estimate is immediate from Lemma 1.

The generalized codes of this section together with some other modifications of the original construction
will be used in Sect. 3 below.

2.3. Modified code construction. Let $G = (V, E)$ be a bipartite graph whose parts are $V_0$ (the left vertices)
and $V_1 \cup V_2$ (the right vertices), where $|V_i| = n$ for $i = 0, 1, 2$. The degree of the left vertices is $\Delta$, the
degree of the vertices in $V_1$ is $\Delta_1$ and the degree of vertices in $V_2$ is $\Delta_2 = \Delta - \Delta_1$. For a given vertex $v \in V_0$
we denote by $E(v)$ the set of all edges incident to it and by $E_i(v) \subset E(v)$, $i = 1, 2$, the subset of edges of
the form $(v, w)$, where $w \in V_i$. The ordering of the edges on $v$ defines an ordering on $E_i(v)$. Note that both
subgraphs $G_i = (V_0 \cup V_i, E_i)$, $i = 1, 2$, can be chosen to be regular, of degrees $\Delta_1$ and $\Delta_2$ respectively.

Let $A$ be a $[t\Delta, R_0 t\Delta, d_0 = t\Delta\delta_0]$ linear binary code of rate $R_0 = \Delta_1/\Delta$. The code $A$ can also be
seen as a $q$-ary additive $[\Delta, R_0\Delta]$ code, $q = 2^t$. Let $B$ be a $q$-ary $[\Delta_1, R_1\Delta_1, d_1 = \Delta_1\delta_1]$ additive code.
We will also need an auxiliary $q$-ary code $A_{\mathrm{aux}}$ of length $\Delta_1$. Every edge of the graph will be associated
with $t$ bits of the codeword of the code $C$ of length $N = nt\Delta$. The code $C$ is defined as the set of vectors
$x = \{x_1, \dots, x_N\}$ such that

(1) For every vertex $v \in V_0$ the subvector $(x_j)_{j \in E(v)}$ is a ($q$-ary) codeword of $A$ and the set of
coordinates $E_1(v)$ is an information set for the code $A$.

(2) For every vertex $v \in V_1$ the subvector $(x_j)_{j \in E(v)}$ is a codeword of $B$;

(3) For every vertex $v \in V_0$ the subvector $(x_j)_{j \in E_1(v)}$ is a codeword of $A_{\mathrm{aux}}$.

Both this construction and the construction from the previous subsection are illustrated in Fig. 1.

We will choose the minimum distance $d_{\mathrm{aux}} = \delta_{\mathrm{aux}}\Delta_1$ of the code $A_{\mathrm{aux}}$ so as to make the quantity $\lambda/d_{\mathrm{aux}}$
arbitrarily small, where $\lambda$ is the second eigenvalue of $G_1$. By choosing $\Delta_1$ large enough, the rate $R_{\mathrm{aux}}$ of
$A_{\mathrm{aux}}$ can be thought of as a quantity such that $1 - R_{\mathrm{aux}}$ is almost $O(1/\sqrt{\Delta})$.

This construction was introduced and studied in [4], [3]. The code $C$ has the parameters $[N = nt\Delta, RN, D]$.
The rate $R$ is estimated easily from the construction:

(4) $R \ge R_0 R_1 - R_0 (1 - R_{\mathrm{aux}})$,

which can be made arbitrarily close to $R_0 R_1$ by choosing $\Delta$ large enough but finite.

The distance $D$ of the code $C$ can be again estimated from Lemma 1 applied to the subgraph $G_1$. Then
we have $\alpha_0 = \delta_{\mathrm{aux}}$, $\alpha_1 = \delta_1$, and

(5) $D \ge \delta_0 \delta_1 \left(1 - \frac{\lambda}{d_{\mathrm{aux}}}\right)\left(1 - \frac{\lambda}{2 d_1}\right) N$.

This means in particular that the relative minimum distance $D/N$ is bigger than a quantity that can be made
arbitrarily close to the product $\delta_0\delta_1$. Together with (4) this means that the distance of the code $C$ for $n \to \infty$
can be made arbitrarily close to the product, or Zyablov, bound [24]

(6) $\delta_Z(R) = \max_{R \le x \le 1} \delta_{GV}(x)(1 - R/x)$.

This result was proved in [3].
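The Zyablov bound (6) has no closed form but is easy to evaluate numerically. The sketch below is one possible way to do so, assuming $0 < R < 1$; the function names (`h`, `delta_gv`, `zyablov`), the bisection tolerance, and the grid search over $x$ are choices made for this illustration and are not taken from the paper.

```python
import math

def h(x):
    """Binary entropy function, in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def delta_gv(rate, tol=1e-12):
    """Relative GV distance: the root of h(delta) = 1 - rate in [0, 1/2]."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(mid) < 1 - rate:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def zyablov(rate, steps=2000):
    """Grid-search approximation of (6): max over rate <= x <= 1
    of delta_GV(x) * (1 - rate / x). Assumes 0 < rate < 1."""
    best = 0.0
    for k in range(steps + 1):
        x = rate + (1 - rate) * k / steps
        best = max(best, delta_gv(x) * (1 - rate / x))
    return best
```

For example, `zyablov(0.5)` can be compared with `delta_gv(0.5)` to see the gap between the product bound and the GV bound at rate 1/2.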

Alternative description of the modified construction. The above code can be thought of as a serially
concatenated code with $A$ as inner binary code and a $Q$-ary outer code with $Q = 2^{t\Delta_1}$. The outer code is
formed by viewing the binary $t\Delta_1$-tuple indexed by the edges of $G_1$ incident to a vertex of $V_0$ as an element
of the $Q$-ary alphabet. The $Q$-ary code $B'$ is defined by conditions (2) and (3) above, and $C$ is obtained by
concatenating $B'$ with $A$. This description of the modified construction is used in [18] to show the existence
of linear-time decodable codes that meet the Zyablov bound and attain the Forney error exponent under
linear-time decoding on the binary symmetric channel as well as the Gaussian and many other communication
channels. Another closely related work is the paper [11], where a similar description was used to prove
that there exist bipartite-graph codes that meet the bound (6) and correct a $\delta_Z/2$ proportion of errors under a
linear-time decoding procedure.



3. Random ensemble of bipartite-graph codes 

Let us discuss average asymptotic properties of the ensemble of bipartite-graph codes. It has been known
since Gallager's 1963 book [10] that the ensemble of random low-density codes (i.e., bipartite-graph codes
with a repetition code on the left and a single parity-check code on the right) contains asymptotically good
codes whose relative distance is bounded away from zero for any code rate $R \in (0, 1)$. Papers [8] and [13]
independently proved that the ensemble of random bipartite-graph codes with Hamming component codes
on both sides contains asymptotically good codes. Here we replace Hamming codes with arbitrary binary
linear codes and show that the corresponding ensemble contains codes that meet the GV bound.

Theorem 2. Let $G = (V_0 \cup V_1, E)$ be a random $\Delta$-regular bipartite graph, $|V_0| = |V_1| = n$, and let $A[\Delta, R_0\Delta]$
be a random linear code. For $n \to \infty$ the average weight distribution over the ensemble of linear codes
$C(G; A, A)$ of length $N = n\Delta$ and rate $R$ is bounded above as $A_{\omega N} \le 2^{NF + o(N)}$, where

(7) $F = \omega\,[R - 1 - 2\log(1 - 2^{R_0 - 1})] - h(\omega)$ if $0 < \omega \le 1 - 2^{R_0 - 1}$,
(8) $F = h(\omega) + R - 1$ if $\omega \ge 1 - 2^{R_0 - 1}$.

Proof: Let $H$ be a $\Delta(1 - R_0) \times \Delta$ parity-check matrix of the code $A$. The parity-check matrix of the code
$C$ can be written as follows: $\mathcal{H} = [\mathcal{H}_1, \mathcal{H}_2]^t$, where

$\mathcal{H}_1 = \begin{pmatrix} H & & & \\ & H & & \\ & & \ddots & \\ & & & H \end{pmatrix}$

(a band matrix with $H$ repeated $n$ times) and $\mathcal{H}_2 = \pi(\mathcal{H}_1)$ is a permutation of the columns of $\mathcal{H}_1$ defined
by the edges of the graph $G$.
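The structure of the parity-check matrix is perhaps easiest to see in code. The sketch below builds one sample of $[\mathcal{H}_1, \mathcal{H}_2]^t$ for a given local matrix $H$; the function name `ensemble_parity_check`, the list-of-rows representation, and the use of a fixed Hamming matrix in place of a random $H$ are assumptions made for the illustration.

```python
import random

def ensemble_parity_check(H, n, rng):
    """Return the rows of [H_1; H_2] for one sample of the ensemble.

    H   : parity-check matrix of the local code (list of 0/1 rows of length Delta)
    n   : number of vertices on each side of the graph
    rng : random.Random instance used to draw the permutation pi
    """
    delta = len(H[0])
    N = n * delta
    # H_1: band (block-diagonal) matrix with H repeated n times
    H1 = []
    for i in range(n):
        for row in H:
            H1.append([0] * (i * delta) + list(row) + [0] * (N - (i + 1) * delta))
    # H_2: the columns of H_1 permuted by a uniformly random pi
    pi = list(range(N))
    rng.shuffle(pi)
    H2 = [[row[pi[j]] for j in range(N)] for row in H1]
    return H1 + H2, pi

# Example local code (an assumption for the demo): the [7, 4] Hamming code
H_hamming = [[1, 0, 1, 0, 1, 0, 1],
             [0, 1, 1, 0, 0, 1, 1],
             [0, 0, 0, 1, 1, 1, 1]]
rows, pi = ensemble_parity_check(H_hamming, n=4, rng=random.Random(0))
```

The permutation acts on all $N$ coordinates, which is how the ensemble in the proof is defined; a sample therefore corresponds to a bipartite multigraph in general.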

To form an ensemble of random bipartite-graph codes assume that $H$ is a random binary matrix with
uniform distribution and that the permutation $\pi$ is chosen with uniform distribution from the set of all
permutations on $N = n\Delta$ elements. Choose the uniform probability on $\{0, 1\}^N$ and endow the
product space of couples $(\mathcal{H}, x)$ with the product probability.

The average number of codewords of weight $w$ is

(9) $A_w = \binom{N}{w} \Pr[\mathcal{H}x^t = 0 \mid w(x) = w]$.

Let us compute the probability $\Pr[\mathcal{H}x^t = 0 \mid w(x) = w]$.
Observe that

$\Pr[\mathcal{H}x^t = 0 \mid w(x) = w] = \Pr[\mathcal{H}_1x^t = 0 \mid w(x) = w]\, \Pr[\mathcal{H}_2x^t = 0 \mid w(x) = w]$

(10) $= (\Pr[\mathcal{H}_1x^t = 0 \mid w(x) = w])^2$.

Let $w = \omega n\Delta$. Let $X_{m,w} \subset \{0, 1\}^N$ be the event where $x$ is of weight $w$ and contains nonzero entries in
exactly $m$ groups of coordinates of the form $(x_{i\Delta + j},\ j = 1, \dots, \Delta;\ i = 0, \dots, n - 1)$. Let $w_i = \omega_i\Delta$ be the
number of ones in the $i$th group. We have

By concavity of the entropy function (or by using Lagrange multipliers), the maximum of the last expression
on $\omega_1, \dots, \omega_m$ under the restriction $\sum_i \omega_i = \omega n$ is attained when $\omega_i = \omega n/m$, $i = 1, \dots, m$. For large $\Delta$
we therefore have

$\Pr[X_{m,w}] \cong 2^{-N + m\Delta h(\omega n/m)}$.

Now we have

$\Pr[\mathcal{H}_1x^t = 0 \mid w(x) = w] = \frac{\Pr[\mathcal{H}_1x^t = 0,\ w(x) = w]}{\Pr[w(x) = w]}$

and

$\Pr[\mathcal{H}_1x^t = 0,\ w(x) = w] = \sum_m \Pr[\mathcal{H}_1x^t = 0,\ X_{m,w}]$,

and clearly

$\Pr[\mathcal{H}_1x^t = 0,\ X_{m,w}] = 2^{m\Delta(R_0 - 1)} \Pr[X_{m,w}]$,

so that

$\Pr[\mathcal{H}_1x^t = 0,\ w(x) = w] \cong 2^{-N + \max_m (m\Delta(R_0 - 1) + m\Delta h(\omega n/m))}$

and

$\Pr[\mathcal{H}_1x^t = 0 \mid w(x) = w] \cong 2^{-h(\omega)N + \max_m (m\Delta(R_0 - 1) + m\Delta h(\omega n/m))}$.

Given (9) and (10), and setting $x = m/n$, we therefore obtain $A_w = 2^{NF(R_0, x)}$, where

$F(R_0, x) = -h(\omega) + 2 \max_{\omega \le x \le 1} \big(x(R_0 - 1 + h(\omega/x))\big) + o(1)$

(11) $\le -h(\omega) + \max_{\omega \le x \le 1} \big(x(R - 1 + 2h(\omega/x))\big) + o(1)$.

The unconstrained maximum on $x$ in the last expression is attained for $x = x_0 = \omega/(1 - z)$, where
$2\log z = R - 1$. Thus, the optimizing value of $x$ equals $x_0$ if this quantity is less than 1 and 1 otherwise.
Substituting $x = x_0$ into (11) and taking into account the equality $R - 1 + 2h(z) = 2(1 - z)\log(z/(1 - z))$,
we obtain

$F(R_0, x) \le -h(\omega) + \frac{\omega}{1 - z}\,(R - 1 + 2h(z))$,

which is exactly (7). Substituting $x = 1$ we obtain the second part of the claim. ∎
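The two branches (7)-(8) of the exponent can be evaluated directly, which also gives a quick numerical check that they agree at the switching point $\omega = 1 - 2^{R_0 - 1}$. The sketch below assumes the rate relation $R = 2R_0 - 1$ (equality in the rate bound for $C(G; A, A)$) and base-2 logarithms; the function name `spectrum_exponent` is hypothetical.

```python
import math

def h(x):
    """Binary entropy function, in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def spectrum_exponent(omega, R):
    """Exponent F of (7)-(8), assuming R = 2*R0 - 1.

    Upper-bounds (1/N) log2 of the ensemble-average number of
    codewords of weight omega * N.
    """
    R0 = (1 + R) / 2
    omega0 = 1 - 2 ** (R0 - 1)          # switching point between (7) and (8)
    if omega <= omega0:
        return omega * (R - 1 - 2 * math.log2(1 - 2 ** (R0 - 1))) - h(omega)
    return h(omega) + R - 1
```

Evaluating `spectrum_exponent` on a grid of `omega` for a fixed `R` reproduces the shape of the average weight spectrum: negative for small relative weights, then crossing zero at the ensemble-average relative distance.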

The result of this theorem enables us to draw conclusions about the average minimum distance of codes
in the ensemble. From (7)-(8) and the proof it is clear that the relation between these expressions is

$\omega\,[R - 1 - 2\log(1 - 2^{R_0 - 1})] - h(\omega) \ge h(\omega) + R - 1$

(since $x_0$ in the previous proof is the only maximum point), and that this inequality is strict for
$\omega < 1 - 2^{R_0 - 1}$. Thus if $1 - 2^{R_0 - 1} < \delta_{GV}(R)$, the first time the exponent of the ensemble average weight spectrum
becomes positive is $\omega = \delta_{GV}(R)$. This would mean that for large $n$ there exist codes in the bipartite-graph
ensemble that approach the GV bound; however, there is one obstacle for this conclusion: since the exponent
approaches 0 for $\omega \to 0$, the codes in principle can contain very small nonzero weights (such as $w$ constant
or growing slower than $n$). This issue is addressed in the next theorem, where a slightly stronger fact is
proved, namely that there exists a constant $\varepsilon > 0$ such that on average, the distance of the codes $C(G; A, A)$
is at least $\varepsilon n$.

Theorem 3. Consider the ensemble of bipartite-graph codes defined in Theorem 2. Let $\omega^*$ be the only
nonzero root of the equation

$\omega\,(R - 1 - 2\log(1 - 2^{(1/2)(R - 1)})) = h(\omega)$.

The ensemble average relative distance behaves as

(12) $\delta(R) = \omega^*$ if $R_0 \le \log(2(1 - \delta_{GV}(R)))$,

(13) $\delta(R) = \delta_{GV}(R)$ if $R_0 > \log(2(1 - \delta_{GV}(R)))$.

In particular, for $R \le 0.202$ the ensemble contains codes that meet the GV bound.
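The rate threshold quoted in Theorem 3 can be recovered numerically from the condition in (13). The sketch below assumes $R_0 = (1 + R)/2$ and base-2 logarithms, and locates by bisection the rate at which $R_0 = \log(2(1 - \delta_{GV}(R)))$; the helper names `meets_gv` and `gv_threshold` and the bracketing interval $[0.01, 0.5]$ are choices made for this illustration.

```python
import math

def h(x):
    """Binary entropy function, in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def delta_gv(rate, tol=1e-12):
    """Relative GV distance: the root of h(delta) = 1 - rate in [0, 1/2]."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(mid) < 1 - rate:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def meets_gv(R):
    """Condition of (13), with R0 = (1 + R)/2 assumed."""
    R0 = (1 + R) / 2
    return R0 > math.log2(2 * (1 - delta_gv(R)))

def gv_threshold(tol=1e-9):
    """Rate at which the condition of (13) stops holding (bisection)."""
    lo, hi = 0.01, 0.5          # condition holds at lo and fails at hi
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if meets_gv(mid):
            lo = mid
        else:
            hi = mid
    return lo
```

Under these assumptions the returned threshold lies just above 0.202, in agreement with the statement of the theorem.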

Proof: As argued in the discussion preceding the statement of Theorem 3, we only need to check that for
sufficiently small $w$, the expected number of codewords of weight $w$ in the ensemble is a vanishing quantity.

Suppose the local code $A$ has minimum distance $d$. Let $w \le cn$, $c$ a constant to be determined later, and
set $m = w/d$. Let $U(w, d) \subset \{0, 1\}^N$ be the set of vectors with the property that if for some $i = 0, \dots, n - 1$
the subvector $(x_{i\Delta + j},\ j = 1, \dots, \Delta)$ is nonzero, it is of weight at least $d$. Let $\mathcal{H}_1$ be as in the proof of
Theorem 2. Then

$\Pr[\mathcal{H}_1x^t = 0 \mid w(x) = w] \le \Pr[x \in U(w, d) \mid w(x) = w]$.

Next



Then

FIGURE 2. Average weight spectrum $N^{-1}\log A_{\omega N}$ for $R = 0.1$ (a) and distance (b) of the ensemble of bipartite-graph codes.

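Recall that $\mathcal{H}_2$ is obtained by randomly permuting the columns of $\mathcal{H}_1$, so the expected number of codewords
of weight $w$ in the code is

$A_w \le \binom{n}{m}^2 \Delta^{2dm}\, 2^{2m\Delta} \Big/ \binom{N}{w}$.

Remember that $\Delta$ is fixed. Since $\binom{N}{w} \ge N^w/w^w$, $N = \Delta n$, and $\binom{n}{m} \le e^m n^m/m^m$, we obtain

$A_w \le \frac{e^{2m} n^{2m}}{m^{2m}}\, \Delta^{2dm}\, 2^{2m\Delta}\, \frac{w^w}{N^w} = e^{2m}\, n^{m(2-d)}\, \Delta^{dm}\, 2^{2m\Delta}\, \frac{w^w}{(w/d)^{2m}} = \left(s^{1/(2-d)}\, \frac{n}{w}\right)^{w(2-d)/d}$,

where $s = (ed)^2 \Delta^d 2^{2\Delta}$ is a constant independent of $n$. For any $w < s^{1/(2-d)} n$, we get that the right-hand side
of the last inequality tends to 0 as $n \to \infty$ whenever $d \ge 3$. ∎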



Corollary 4. Consider the ensemble $\mathcal{C}$ of bipartite-graph codes defined in Theorem 2. If the distance of
the code $A$ is at least 3 then the ensemble-average relative distance is bounded away from zero (i.e., the
ensemble $\mathcal{C}$ contains asymptotically good codes).

The results of the last two theorems (ensemble-average weight spectrum and relative distance) are shown
in Fig. 2. Note that random bipartite-graph codes are asymptotically good for all code rates. We also
observe that the behavior of the function $\log A_{\omega N}$ is similar to that of the logarithm of the ensemble-average
weight spectrum for Gallager's codes (see [10], particularly p. 16) and of other LDPC code ensembles.
It is interesting to note that Gallager's codes in [10] become asymptotically good on the average once the
number $j$ of ones in the column of the parity-check matrix is at least 3. Similarly, we need distance-3 local
codes to guarantee relative distance bounded away from zero in the ensemble of bipartite-graph codes.



We conclude this section by mentioning two groups of results related to the above theorems.
1. A different analysis of the weight spectrum of codes on graphs with a fixed local code $A$ was performed
in [8], [13]. Let $A$ be a $[\Delta, R_0\Delta]$ linear binary code with weight enumerator $a(y)$. Let $\lambda = \lambda(\omega)$ be the
root of $(\ln a(e^s))'_s = \Delta\omega$ with respect to $s$. Let $A_{\omega N}$ be the component of the ensemble-average weight
spectrum of the code $C$. As $n \to \infty$, we have [8], [13]

(14) $N^{-1} \log_2 A_{\omega N} \le -h(\omega) + \frac{2}{\ln 2}\left(\frac{\ln a(e^{\lambda})}{\Delta} - \lambda\omega\right) + o(1)$.

Note that this is a Chernoff-bound calculation since $a(e^s)$ is (proportional to) the moment generating function
of the code $A$. A variation of this method can also be used to obtain Theorem 2, although the argument is not
simpler than the direct proof presented above. On the other hand, the proof method of Theorem 2 does not
seem to lead to a closed-form expression for the ensemble-average weight spectrum for a particular code $A$
(cf. (7)).

Figure 3. Average weight spectrum $N^{-1}\log A_{\omega N}$ of the code $C$: (a) the local code $A$ is the Hamming
code (upper curve) and a random code (lower curve); (b) the local code $A$ is the Golay
[23, 12, 7] code (upper curve) and random code (lower curve).

It is interesting to compare the weight spectrum (7)-(8) to the spectrum (14). For instance let $A$ be the
[7, 4, 3] Hamming code with $a(y) = 1 + 7y^3 + 7y^4 + y^7$. We plot the spectrum of the code $C$ in Fig. 3(a)
together with the weight spectrum (7)-(8) and do the same for $A$ the [23, 12, 7] Golay code in Fig. 3(b).
For the code $C$ with local Hamming codes the parameters are: $R \ge 1/7$, $\delta \ge 0.186$. The GV distance
$\delta_{GV}(1/7) \approx 0.281$. For the case of the Golay code we have $R \ge 1/23$, $\delta \ge 0.3768$. The GV distance in this
case is $\delta_{GV}(1/23) \approx 0.3788$.
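The GV distance used for comparison above is the root of $h(\delta) = 1 - R$, which is straightforward to compute by bisection. The sketch below does this for the rate bound of the Hamming example and also checks that the quoted weight enumerator accounts for all $2^4$ codewords; the function names and the tolerance are assumptions for the illustration.

```python
import math

def h(x):
    """Binary entropy function, in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def delta_gv(rate, tol=1e-12):
    """Relative GV distance: the root of h(delta) = 1 - rate in [0, 1/2]."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(mid) < 1 - rate:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Coefficients of a(y) = 1 + 7y^3 + 7y^4 + y^7 for the [7, 4, 3] Hamming code
hamming_enumerator = [1, 0, 0, 7, 7, 0, 0, 1]

# Rate bound for C(G; A, A) with Hamming local codes: R >= 2 * 4/7 - 1 = 1/7
rate_bound = 2 * 4 / 7 - 1
gv_at_rate = delta_gv(rate_bound)
```

The same `delta_gv` call with the Golay rate bound gives the reference value for the second example.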

The main result of [8], [13] is that bipartite-graph codes with Hamming local codes are asymptotically
good. We remark that this also follows as a particular case of Corollary 4 above.

2. Recall the asymptotic behavior of other versions of concatenated codes, in particular serial
concatenations. Consider the ensemble of concatenated codes with random $[\Delta, R_0\Delta]$ inner codes $A$ and MDS outer
codes $B$. The following results are due to E. L. Blokh and V. V. Zyablov [7] and C. Thommesen [21]. The
average weight spectrum is given by $A(N\omega) = 2^{N(F + o(1))}$, where
$F = R - R_0 - \omega \log(2^{1 - R_0} - 1)$, $\quad 0 < \omega \le 1 - 2^{R_0 - 1}$,
$F = h(\omega) + R - 1$, $\quad \omega \ge 1 - 2^{R_0 - 1}$.

The ensemble average relative distance is given by

$\delta(R) = \delta_{GV}(R)$ if $R_0 \ge \log(2(1 - \delta_{GV}(R)))$,
$\delta(R) = \frac{R - R_0}{\log(2^{1 - R_0} - 1)}$ if $0 \le R_0 < \log(2(1 - \delta_{GV}(R)))$,



where $R$ is the code rate. With all the similarity of these results to those proved in this section there is
one substantial difference: with serial concatenations there is full freedom in choosing the rate $R_0$ of the
inner codes, while with parallel codes, once the overall rate is fixed, the rate $R_0$ of the code $A$ is also fully
determined. This explains the fact that serially concatenated codes meet the GV bound for all rates $R$ while
bipartite-graph codes do so only for relatively low code rates.

Another result worth mentioning in this context [2] concerns the behavior of serially concatenated codes with
a fixed inner code $A$ and random outer $q$-ary code $B$. This ensemble can be viewed as a serial version of the
parallel concatenated ensemble of this section. It is interesting to note that for a fixed local code, the serial
construction turns out to be more restrictive than the parallel one. In particular, [2] shows that serially concatenated
codes with a fixed inner code and random outer code approach the GV bound only for rate $R \to 0$. They
are also asymptotically good, although below the GV bound, for a certain range of code rates depending on
the code $A$.

The results about the weight spectrum of bipartite-graph codes can also be used to estimate the ensemble
average error exponent of codes under maximum likelihood decoding. This is a relatively standard
calculation that can be performed in several ways; we shall not dwell on the details here. Of course, for code rates
$R \le 0.202$, when the codes meet the GV bound and their weight spectrum is binomial, the error exponent
of their maximum likelihood decoding will meet Gallager's bound $E_0(R, p)$. Similar results were earlier
established for serial concatenations [7], [22].



In this section we present a constructive family of one-level, parallel concatenated codes that surpass the
product bound on the distance for all code rates $R \in (0, 1)$.

The intuition behind the analysis below is as follows. The distance of two-level code constructions
such as Forney's concatenated codes and similar ones is often estimated by the product of the distances of
component codes. In [3] this result is established for expander codes (note that its proof, different from the
corresponding proofs for serial concatenations, is based on the expanding properties of the graph $G_1$).

It has long been recognized that apart from some special cases (such as product codes and the like) the
actual relative minimum distance of two-level codes often exceeds the relative "designed distance" which
in this case is the product $\delta_0\delta_1$. To see why this is the case let us recall the serial concatenated construction
which is obtained from an $[n_1, k_1, d_1]$ $q$-ary Reed-Solomon code $B$, $q = 2^{k_0}$, and an $[n_0, k_0, d_0]$ binary code
$A$. A typical codeword of the concatenated code $C$ can be thought of as a binary $n_0 \times n_1$ matrix in which
the $i$th column, $1 \le i \le n_1$, represents an encoding with the code $A$ of the binary representation of the $i$th
symbol of the codeword in $B$.

A codeword of weight $d_0 d_1$ in the code $C$ can be obtained only if there exists a codeword of weight $d_1$ in
the code $B$ in which every symbol is mapped on a codeword of weight $d_0$ in the code $A$. By experience, the
true distance of the code $C$ exceeds the product bound substantially (for instance, on the average
concatenated codes approach the GV distance; see the end of Sect. 3), although quantifying this phenomenon for
constructive code families is a difficult problem.

The situation is different for expander codes (we will analyze the modified construction of the previous
section) because the component codes $A$ and $B$ are of constant length, so we can have more control of both
the binary and the $q$-ary weight of the symbols in the codeword and still obtain a constructive code family.
The analysis below is based on the following intuition: the roles of the codes $A$ and $B$ are not symmetric. If
the product bound $\delta_0\delta_1$ were to be achieved by some codeword of $C$, then the subcodewords corresponding
to vertices of $V_1$ would have a relatively low $q$-ary weight (equal to $\delta_1$) but a relatively high binary weight,
concentrated into few $q$-ary symbols. On the other hand the subcodewords corresponding to vertices of $V_0$
would spread out their binary weight among all their $q$-ary symbols, each of which would have a relatively
low binary weight. The edges of the bipartite graph correspond to symbols of the two codes, making these
conditions incompatible.

We now elaborate on this idea, beginning with the code construction of Sect. 2.2. The analysis in this
case is simple and paves the way for a more complicated calculation for the modified bipartite-graph codes and
an improved distance bound.

4.1. Basic construction. Let us estimate the minimum binary weight of a codeword in the code $C(G; A, B)$
where $G$ is a graph with a small second eigenvalue $\lambda$. Recall that $E(v)$ denotes the set of edges incident on
a vertex $v$.

Let us introduce some notation. Let $x \in C$ be a codeword. For a given vertex $v$, the subvector $x_v \in
\{0, 1\}^{\Delta t}$ can be partitioned into $\Delta$ consecutive segments of $t$ bits. We write $x_v = (x_1, \dots, x_\Delta)$, where

$x_i = (x_{t(i-1)+j},\ 1 \le j \le t), \quad 1 \le i \le \Delta$,

each segment corresponding to its own edge $e \in E(v)$. The Hamming weight $w(x_i)$ will also be called the
binary weight of the edge $e$, denoted $w_b(e)$. The corresponding relative weight of the edge is denoted by
$\omega_b(e) = w_b(e)/t$. We call an edge nonzero relative to the codeword $x$ if $w_b(e) \ne 0$. The number of nonzero
edges of $x$ is called the $q$-ary weight of $x$.

For a subset of vertices $S \subset V_i$, $i = 1, 2$, let $E(S) = \cup_{v \in S} E(v)$. For two subsets $S \subset V_0$, $T \subset V_1$ denote
by $G_{S \cup T}$ the subgraph of $G$ induced by $S$ and $T$ and let $(S, T)$ be the set of its edges. In particular, if $T$ is
just one vertex, we denote by $(S, v)$ the set of edges that connect $v$ and $S$. Let $\deg_S(v) = |(S, v)|$.

Consider a codeword $x$ of the code $C$. Let $S \subset V_0$ be the smallest subset of left vertices that contains all
the nonzero coordinates of $x$, and let $T \subset V_1$ be the same for right vertices. Formally, $\mathrm{supp}(x) \subset (S, T)$,
and both $S$ and $T$ are minimum subsets by inclusion that satisfy this property. Note that all edges in $G \setminus G_{S \cup T}$
correspond to zero symbols of $x$ (but there may be additional zero symbols).

Let $\gamma = \gamma(x)$ be the average, over all edges $e$ that join a vertex of $S$ to a vertex of $T$, of the relative binary
weight of $e$:

$\gamma = \frac{1}{|(S, T)|} \sum_{e \in (S, T)} \omega_b(e)$.

Let $v$ be some vertex, either of $S$ or of $T$. Let us define two local parameters $\beta_v$, $\gamma_v$. These parameters are
relative to the codeword $x$.

• The quantity $\beta_v$ is defined as the average, over all nonzero edges $e$ incident to $v$, of the relative (to
$t$) binary weight $\omega_b(e)$:

$\beta_v = \frac{1}{|\{e \in E(v) : w_b(e) \ne 0\}|} \sum_{e \in E(v)} \omega_b(e)$.

• The quantity $\gamma_v$ is defined as the average, over all edges $e$, zero or not,

- that join $v$ to a vertex of $T$ if $v \in S$,

- that join $v$ to a vertex of $S$ if $v \in T$,

of the relative binary weight $\omega_b(e)$ of $e$. For instance, if $v \in T$, then

$\gamma_v = \frac{1}{\deg_S(v)} \sum_{e \in (S, v)} \omega_b(e)$.

Note that $\gamma_v \le \beta_v$.

We will use the big-0 and little-o notation relative to functions of the degree A. For instance 0(l/\/A) 
denotes a quantity bounded above by c/A, where c does not depend on A. 
Before we proceed we need to recall the following "expander mixing" lemma: 

Lemma5. LetG= (VbUVi, E) be a /^.-regular bipartite graph, \Vq\ = \V\\ = n, with second eigenvalue X. 
Let S C Vq, \S\ = an. Let a > X/2aA. Let U C V\ be defined by U = {v G V\ : deg s (u) > (1 + a)oA}, 



Xj = (x t( i^ +j , 1 <j<t), 1 < i < A 



(15) 





then 



\U\< 



A 



\S\. 



2a Aa - A 
Below we assume that $G$ is a Ramanujan graph, implying that $\lambda \le 2\sqrt{\Delta - 1}$. Recall from an earlier lemma that 
since $\delta_0$ and $\delta_1$ are fixed, the value $\sigma$ is bounded below by a quantity independent of $\Delta$ and can be thought 
of as a constant in the following analysis. 

Using this in the above lemma, we obtain 

$$|U| \le \frac{\lambda |S|}{2\alpha\Delta\sigma - \lambda} \le \frac{|S|}{\alpha\sigma\sqrt{\Delta} - 1} \le \frac{cn}{\sqrt{\Delta}}, \tag{16}$$

where $c = c(\alpha, \sigma) = 1/(\alpha - 1/(\sigma\sqrt{\Delta}))$. We will choose $\alpha$ to be a quantity that, when $\Delta$ grows, tends to 
zero and is such that $\alpha\sqrt{\Delta}$ tends to $\infty$: what (16) shows us is that $|U|/n$ is a vanishing quantity when $\Delta$ 
grows, which we will write as $|U|/n = o_\Delta(1)$. Similarly, applying Lemma 5 in the same way to the set 
$\bar{S} = V_0 \setminus S$, we obtain the following corollary. 



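Corollary 6. Let $\alpha$ be such that $\alpha = o_\Delta(1)$, $1/(\alpha\sqrt{\Delta}) = o_\Delta(1)$. Let 

$$R_\alpha = \{v \in V_1 : (1 - \alpha)\sigma\Delta \le \deg_S(v) \le (1 + \alpha)\sigma\Delta\}.$$

Then $1 - |R_\alpha|/n = o_\Delta(1)$. 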
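As a numerical illustration (not part of the original argument), the following Python sketch evaluates the right-hand side of the expander mixing estimate for a Ramanujan graph, $\lambda = 2\sqrt{\Delta-1}$, and compares it with the cruder form $c/\sqrt{\Delta}$. The function names, the value $\sigma = 0.5$ and the choice $\alpha = \Delta^{-1/4}$ are assumptions made only for this example; any $\alpha$ with $\alpha \to 0$ and $\alpha\sqrt{\Delta} \to \infty$ would serve.

```python
import math

def ramanujan_lambda(deg):
    """Largest second eigenvalue allowed for a Ramanujan graph of degree deg."""
    return 2.0 * math.sqrt(deg - 1)

def heavy_fraction_bound(deg, sigma, alpha):
    """Upper bound on |U|/n from the mixing lemma, with |S| = sigma * n."""
    lam = ramanujan_lambda(deg)
    denom = 2.0 * alpha * deg * sigma - lam
    if denom <= 0:
        raise ValueError("the lemma needs alpha > lambda / (2 * sigma * deg)")
    return lam * sigma / denom

def simplified_bound(deg, sigma, alpha):
    """The cruder estimate c / sqrt(deg), c = 1 / (alpha - 1 / (sigma * sqrt(deg)))."""
    c = 1.0 / (alpha - 1.0 / (sigma * math.sqrt(deg)))
    return c / math.sqrt(deg)

if __name__ == "__main__":
    sigma = 0.5  # hypothetical relative size of S
    for deg in (10**4, 10**6, 10**8):
        alpha = deg ** -0.25  # illustrative choice, not taken from the paper
        print(deg, heavy_fraction_bound(deg, sigma, alpha),
              simplified_bound(deg, sigma, alpha))
```

With these choices the bound on $|U|/n$ shrinks as the degree grows, which is the behaviour summarized by the notation $o_\Delta(1)$.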

What the expander lemma essentially says is that for any set $S$ of vertices of $V_0$, almost every vertex of 
$V_1$ will have a proportion of its edges incident to $S$ that almost equals $\sigma = |S|/n$. Going back to the sets 
$S$ and $T$ associated to the codeword $x$, the consequence of this is that $\gamma$ is essentially obtained by simple 
averaging of the $\gamma_v$'s: more precisely, 



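Lemma 7. 

$$\frac{1}{|S|}\sum_{v \in S}\gamma_v + o_\Delta(1) = \gamma = \frac{1}{|T|}\sum_{v \in T}\gamma_v + o_\Delta(1).$$

Proof: For instance, let us prove the second equality. Let $|S| = \sigma n$, $|T| = \tau n$. 
First write 

$$|(S,T)| = \sum_{v \in T}\deg_S(v) = \sum_{v \in T \cap R_\alpha}\deg_S(v) + \sum_{v \in T \setminus R_\alpha}\deg_S(v) = \sum_{v \in T \cap R_\alpha}\Delta(\sigma + o_\Delta(1)) + n\Delta\,o_\Delta(1)$$

by Corollary 6. We obtain, again by Corollary 6, 

$$\frac{|(S,T)|}{n\Delta} = \sigma\tau + o_\Delta(1). \tag{17}$$

Next, by definition of $\gamma$ and $\gamma_v$, 

$$|(S,T)|\,\gamma = \sum_{v \in T}\deg_S(v)\,\gamma_v.$$

As above, partition $T$ into $T \cap R_\alpha$ and $T \setminus R_\alpha$ and apply Corollary 6 to obtain 

$$|(S,T)|\,\gamma = \sum_{v \in T \cap R_\alpha}\sigma\Delta\,\gamma_v + n\Delta\,o_\Delta(1)$$

$$= \sum_{v \in T}\sigma\Delta\,\gamma_v + n\Delta\,o_\Delta(1). \tag{18}$$

Now rewriting (17) as $|(S,T)| = |T|\sigma\Delta(1 + o_\Delta(1))$ and dividing it out of (18) gives the result. ∎ 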
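The exact identity $|(S,T)|\,\gamma = \sum_{v \in T}\deg_S(v)\,\gamma_v$ used in this proof, and the inequality between the two local parameters, can be checked on a small example. The Python sketch below is only an illustration: the graph, the vertex names and the edge weights are hypothetical toy data, and only right vertices $v \in T$ are handled.

```python
from fractions import Fraction

# Hypothetical toy data: relative binary weights omega_b(e) on the edges of a
# complete bipartite graph with left vertices a0..a2 and right vertices b0..b2.
weights = {
    ("a0", "b0"): Fraction(1, 2), ("a0", "b1"): Fraction(0), ("a0", "b2"): Fraction(0),
    ("a1", "b0"): Fraction(1, 4), ("a1", "b1"): Fraction(3, 4), ("a1", "b2"): Fraction(0),
    ("a2", "b0"): Fraction(0), ("a2", "b1"): Fraction(0), ("a2", "b2"): Fraction(0),
}

def support_sets(w):
    """Smallest left set S and right set T whose edges cover all nonzero edges."""
    S = {u for (u, v), x in w.items() if x != 0}
    T = {v for (u, v), x in w.items() if x != 0}
    return S, T

def edges_between(w, S, T):
    """Weights of all edges (zero or not) joining S and T."""
    return [x for (u, v), x in w.items() if u in S and v in T]

def gamma(w, S, T):
    """Average relative weight over all edges of (S, T)."""
    e = edges_between(w, S, T)
    return sum(e) / len(e)

def deg_S(w, S, v):
    """Number of edges joining the right vertex v to S."""
    return len(edges_between(w, S, {v}))

def gamma_v(w, S, v):
    """Average relative weight over all edges joining the right vertex v to S."""
    e = edges_between(w, S, {v})
    return sum(e) / len(e)

def beta_v(w, v):
    """Average relative weight over the nonzero edges at the right vertex v."""
    nz = [x for (u, vv), x in w.items() if vv == v and x != 0]
    return sum(nz) / len(nz)

if __name__ == "__main__":
    S, T = support_sets(weights)
    print(sorted(S), sorted(T), gamma(weights, S, T))
    for v in sorted(T):
        print(v, deg_S(weights, S, v), gamma_v(weights, S, v), beta_v(weights, v))
```

Exact rational arithmetic is used so that the identity can be compared with `==` rather than up to rounding.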

Our strategy will be to consider $\gamma$ as a parameter liable to vary between 0 and 1. For every possible $\gamma$ we 
shall find a lower bound for the total weight $\delta(\gamma)$ of $x$ and then minimize over $\gamma$. We have introduced the 
two local parameters $\beta_v$ and $\gamma_v$ for a technical reason: the quantity $\beta_v$ is the natural one to consider when 
estimating the weight of the local code at vertex $v$. However, averaging the $\beta_v$'s when $v$ ranges over $S$ or $T$ 
is tricky, while Lemma 7 enables us to manage the averaging of the $\gamma_v$'s conveniently. 






Now we introduce the constrained distance of $A$: it is defined to be any function $\delta_0(\beta)$ of $\beta \in (0, 1)$ that 

• is $\cup$-convex, continuous for $\beta$ bounded away from the ends of the interval, and is non-decreasing in $\beta$, 

• is a lower bound on the minimum relative binary weight of a codeword of $A$ under the restriction 
that the average binary weight of its nonzero edges is equal to $\beta t$. 

The next lemma should explain the purpose of this definition. 

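Lemma 8. Let $x$ be some codeword of $C$ and let $S$, $|S| = \sigma n$, and $\gamma$ be the quantities defined above. The 
binary weight $w_b(x) = \omega(x)N$ satisfies 

$$\omega(x) \ge \sigma\delta_0(\gamma) + o(1).$$

Proof: We clearly have 

$$\omega(x) \ge \frac{1}{n}\sum_{v \in S}\delta_0(\beta_v).$$

Now notice that by their definition $\beta_v \ge \gamma_v$, so that $\delta_0(\beta_v) \ge \delta_0(\gamma_v)$ since $\delta_0(\cdot)$ is non-decreasing. Furthermore, 
by convexity and uniform continuity of $\delta_0$ and by Lemma 7, 

$$\frac{1}{|S|}\sum_{v \in S}\delta_0(\gamma_v) \ge \delta_0(\gamma) + o(1).$$

∎ 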

Next we bound $\sigma$ from below as a function of $\gamma$. We do this in two steps. The first step is to evaluate a 
constrained distance $\delta_1(\beta)$ for $B$, defined as the minimum relative q-ary weight of any nonzero codeword of 
$B$ such that the average binary weight of its nonzero symbols (edges) is equal to $\beta t$. 

The following lemma is an existence result obtained by the random choice method, but since the code B 
is of fixed size, it can be chosen through exhaustive search without compromising constructibility. 

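Lemma 9. For any $\varepsilon > 0$, and $t$ and $\Delta$ large enough, there exist codes $B$ of rate $R_1$ such that for any 
$0 < \beta < 1$, the minimum relative $\beta$-constrained q-ary weight of $B$ satisfies 

$$\delta_1(\beta) \ge \frac{1 - R_1}{h(\beta)} - \varepsilon.$$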
Proof: We use random choice analysis: let us count the number $N_w$ of vectors $z \in \{0, 1\}^{t\Delta}$ of q-ary weight 
$w = \omega\Delta$ such that the average binary weight of its non-zero q-ary symbols is $\beta t$. Let $w_i$, $i = 1, \dots, w$, be 
the weights of these non-zero t-tuples. We have: 

$$N_w = \binom{\Delta}{w}\sum_{w_1 + \dots + w_w = \beta t w}\ \prod_{i=1}^{w}\binom{t}{w_i}.$$

By convexity of entropy, for sufficiently large $\Delta$ and $t$, the largest term on the right-hand side is when all the $w_i$ 
are equal. Then 

$$N_w \le \binom{\Delta}{w}\binom{t}{\beta t}^{w} \le 2^{\omega\Delta t(h(\beta) + \varepsilon)}$$

when $t$ is large enough. Hence, for a randomly chosen code of rate $R_1$, the number $A_{\omega,\beta}$ of $\beta$-constrained 
codewords of relative weight $\omega$ has an expected value 

$$A_{\omega,\beta} \le 2^{t\Delta(R_1 - 1 + \omega(h(\beta) + \varepsilon))}.$$

As long as $\omega$ is chosen so that the above exponent is less than zero, there exists a code whose $\beta$-constrained 
minimum distance is at least $\omega$. Furthermore, since the number of possible values of $(\omega, \beta)$ (for which $w$ and 
$w\beta t$ are integers) is not more than polynomial in $t\Delta$, we obtain the existence of codes that satisfy our claim 
for all values of $\beta$. ∎ 

Comments: For $\beta < \delta_{GV}(R_1)$ we obtain values of $\delta_1(\beta)$ that are greater than 1. This simply means that 
no $\beta$-constrained codewords exist. 
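The comment above can be checked numerically. The sketch below evaluates the quantity $(1 - R_1)/h(\beta)$, which is the estimate that the later proofs attribute to Lemma 9 (the $\varepsilon$ term is dropped), and confirms that it crosses 1 exactly at $\beta = \delta_{GV}(R_1)$. The function names and the sample rate $R_1 = 0.5$ are assumptions made for this illustration.

```python
import math

def h(x):
    """Binary entropy function in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def delta_gv(rate):
    """Gilbert-Varshamov relative distance: root of h(d) = 1 - rate on [0, 1/2]."""
    lo, hi = 0.0, 0.5
    for _ in range(100):  # bisection; h is increasing on [0, 1/2]
        mid = (lo + hi) / 2.0
        if h(mid) < 1.0 - rate:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def constrained_qary_weight(rate, beta):
    """(1 - rate) / h(beta), for 0 < beta < 1."""
    return (1.0 - rate) / h(beta)

if __name__ == "__main__":
    r1 = 0.5  # hypothetical rate of the code B
    d = delta_gv(r1)
    for beta in (d / 2.0, d, 0.3):
        print(beta, constrained_qary_weight(r1, beta))
```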
It follows from [21] that the same bound on the $\beta$-constrained distance can be obtained for Reed-Solomon 
codes over $GF(2^t)$ whose symbols are mapped to binary t-vectors by random linear transformations. Thus, 
it is possible to prove the results of this section restricting oneself to Reed-Solomon q-ary codes $B$. 

From now on we assume that $B$ is chosen in the way guaranteed by Lemma 9. We can now prove: 

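Lemma 10. Let $\varepsilon > 0$. For a codeword $x \in C$, let $S$ and $\gamma$ be defined as in (13) and let $\sigma = |S|/n$. There 
exist $\Delta$ and $t$ such that 

$$\sigma \ge \frac{1 - R_1}{\bar{h}(\gamma)} - \varepsilon,$$

where $\bar{h}(\beta)$ is defined as $\bar{h}(\beta) = h(\beta)$ for $0 \le \beta \le 1/2$ and $\bar{h}(\beta) = 1$ for $1/2 \le \beta \le 1$. 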

Proof: By Lemma 7 we have $\sum_{v \in T}\gamma_v \le (\gamma + o_\Delta(1))|T|$. Therefore there must be a possibly small but 
non-negligible subset of right vertices $T_1 \subset T$ (namely at least $|T_1| \ge \varepsilon|T|$) for which $\gamma_v \le \gamma + \varepsilon$. By 
Corollary 6 the subset of vertices $T_2 \subset T_1$ that do not satisfy $(1 - \alpha)\sigma\Delta \le \deg_S(v) \le (1 + \alpha)\sigma\Delta$ is of 
size $|T_2| = n\,o_\Delta(1)$ for $\alpha$ arbitrarily small. Consider a vertex $v \in T_1 \setminus T_2$ and let $\omega_v$ be its relative q-ary 
weight. Let $\sigma'$ be the proportion of nonzero edges among the edges from $v$ into $S$. Since $\deg_S(v)$ can be 
taken to be arbitrarily close to $\sigma\Delta$ we write, dropping vanishing terms, $\sigma' = \omega_v/\sigma$. By their definitions we 
have $\beta_v = \gamma_v/\sigma'$. By Lemma 9 we have $\omega_v \ge (1 - R_1)/h(\beta_v)$ and therefore 

$$\sigma \ge \frac{1 - R_1}{\sigma'\,h(\gamma_v/\sigma')}.$$

But, noticing that the function $h$ is $\cap$-convex, we have $h(x) \ge \sigma' h(x/\sigma')$ for any $x$ and any $\sigma' \le 1$, so that 
$\sigma \ge (1 - R_1)/h(\gamma_v)$. Finally, we have $h(x) \le \bar{h}(x)$ for every $x$, and since $\gamma_v \le \gamma$ (omitting $\varepsilon$ terms) and $\bar{h}$ 
is non-decreasing we have $\bar{h}(\gamma_v) \le \bar{h}(\gamma)$, which proves the result. ∎ 

Let us now estimate $\delta_0(\beta)$. 

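Lemma 11. Let $\varepsilon > 0$, $0 < \beta < 1$, $\lambda(\beta) = \beta/h(\beta)$. For sufficiently large $\Delta$ and $t$ there exists a code $A$ for 
which a suitable function $\delta_0(\beta)$ is given by 

$$\delta_0(\beta) = (1 - R_0)\,g(\beta), \tag{19}$$

where 

• $g(\beta) = \delta_{GV}(R_0)/(1 - R_0)$ if $\beta \le \delta_{GV}(R_0)$, 

• $g(\beta) = \lambda(\beta)$ if $\delta_{GV}(R_0) \le \beta$ and $R_0 \le 0.284$, 

• if $\delta_{GV}(R_0) \le \beta$ and $0.284 < R_0 \le 1$, 

$$g(\beta) = \frac{a\beta + b}{\beta_1 - \delta_{GV}(R_0)}, \qquad \delta_{GV}(R_0) \le \beta \le \beta_1, \tag{20}$$

$$g(\beta) = \lambda(\beta), \qquad \beta_1 \le \beta \le 1, \tag{21}$$

where $\beta_1$ is the largest root of 

$$h(\beta)\Big(\beta - h(\beta)\,\frac{\delta_{GV}(R_0)}{1 - R_0}\Big) = -(\beta - \delta_{GV}(R_0))\log(1 - \beta),$$

$$a = \lambda(\beta_1) - \lambda(\delta_{GV}(R_0)), \qquad b = \lambda(\delta_{GV}(R_0))\,\beta_1 - \lambda(\beta_1)\,\delta_{GV}(R_0).$$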

Proof: We again apply random choice: more precisely, let $A$ be chosen to have rate $R_0$ and satisfy Lemma 9. 
Lemma 9 applied to the code $A$ tells us that if $\beta t$ is the average binary weight of the nonzero q-ary symbols 
of some codeword, then this codeword must have q-ary weight at least $\Delta(1 - R_0)/h(\beta)$: for $\beta < \delta_{GV}(R_0)$ 
this quantity is larger than $\Delta$, meaning that such a codeword does not exist and we may choose any value we 
like for $\delta_0(\beta)$. For $\beta \ge \delta_{GV}(R_0)$, since the total binary weight of the codeword equals $\beta t$ times its q-ary 
weight, we obtain that this codeword has total binary weight at least $t\Delta(1 - R_0)\lambda(\beta)$. 

Now the function $\lambda(\beta) = \beta/h(\beta)$ is convex for $0.197 \le \beta \le 1$. Thus if $R_0 \le 1 - h(0.197)$, we can 
define $\delta_0(\beta) = \delta_{GV}(R_0)$ for $0 \le \beta \le \delta_{GV}(R_0)$ and $\delta_0(\beta) = (1 - R_0)\lambda(\beta)$ for $\delta_{GV}(R_0) \le \beta \le 1$. For greater 
values of the rate $R_0$ we must replace the non-convex part of the curve $\lambda(\beta)$ with some convex function 
such as a tangent to this curve. This results in elementary but cumbersome calculations which lead to the 
claim of the lemma. ∎ 

FIGURE 4. Estimates of the constrained distance of the code $A$ and of $\sigma$. 

The behavior of the functions $\delta_0(\beta)$ and $\sigma = \sigma(\beta)$ is sketched in Fig. 4. 
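The tangent construction of Lemma 11 can be reproduced numerically. The following Python sketch is an illustration under stated assumptions, not the authors' computation: it takes logarithms to base 2, uses the inflection value 0.197 quoted in the proof as the left end of the search interval, and locates $\beta_1$ by bisection on the tangency equation. All function names are hypothetical.

```python
import math

INFLECTION = 0.197  # value quoted in the text: lambda(beta) is convex beyond it

def h(x):
    """Binary entropy function in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def delta_gv(rate):
    """Root of h(d) = 1 - rate on [0, 1/2], by bisection."""
    lo, hi = 0.0, 0.5
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if h(mid) < 1.0 - rate:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def lam(beta):
    """lambda(beta) = beta / h(beta)."""
    return beta / h(beta)

def tangency_gap(beta, rate0):
    """Left side minus right side of the equation defining beta_1."""
    d = delta_gv(rate0)
    return (h(beta) * (beta - h(beta) * d / (1.0 - rate0))
            + (beta - d) * math.log2(1.0 - beta))

def beta_1(rate0):
    """Tangency point; returns delta_GV(R0) when no tangent is needed."""
    d = delta_gv(rate0)
    if d >= INFLECTION:  # corresponds to R0 <= 0.284
        return d
    lo, hi = INFLECTION, 1.0 - 1e-9  # gap is positive at lo, negative at hi
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if tangency_gap(mid, rate0) > 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def g(beta, rate0):
    """The function g of Lemma 11: constant, then tangent line, then lambda."""
    d = delta_gv(rate0)
    if beta <= d:
        return d / (1.0 - rate0)
    b1 = beta_1(rate0)
    if beta >= b1:
        return lam(beta)
    a = lam(b1) - lam(d)
    b = lam(d) * b1 - lam(b1) * d
    return (a * beta + b) / (b1 - d)

if __name__ == "__main__":
    for r0 in (0.2, 0.5, 0.75):
        print(r0, delta_gv(r0), beta_1(r0))
```

On the interval between $\delta_{GV}(R_0)$ and $\beta_1$ the line stays below $\lambda(\beta)$, so $g$ is a convex minorant of $\lambda$, as the definition of the constrained distance requires.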

Now together, Lemmas 8, 10 and 11 give us the following lower bound on the relative distance $\delta$ of the 
code $C(G; A, B)$: 

$$\delta \ge \min_{0 < \beta < 1}\,(1 - R_0)(1 - R_1)\,\frac{g(\beta)}{\bar{h}(\beta)}.$$

Since $g(\beta)$ is non-decreasing and $\bar{h}(\beta)$ is constant for $\beta \ge 1/2$, the minimum is clearly achieved for 
$\beta \le 1/2$. Similarly, $\bar{h}(\beta)$ is non-decreasing and $g(\beta)$ is constant for $\beta \le \delta_{GV}(R_0)$, so that the minimum 
must be achieved for $\beta \ge \delta_{GV}(R_0)$. We can therefore limit $\beta$ to the interval $(\delta_{GV}(R_0), 1/2)$ and replace 
$\bar{h}(\beta)$ by $h(\beta)$. Optimizing on $R_0$ to get the best possible $\delta$ for a given code rate $R$, we get: 

$$\delta \ge \max_{R_0, R_1}\ \min_{\delta_{GV}(R_0) \le \beta \le 1/2}\,(1 - R_0)(1 - R_1)\,\frac{g(\beta)}{h(\beta)}.$$

The full optimization is possible only numerically, but we can make one simplification which entails only 
small changes in the value of $\delta(R)$. Namely, let us optimize on the rates of component codes $R_0$, $R_1$ ignoring 
the dependence of $g(\beta)$ on $R_0$. Let us choose $R_0$ to satisfy $1 - R_0 = \frac{1}{2}(1 - R)$; then $1 - R_1 \ge \frac{1}{2}(1 - R)$, 
and we obtain the bound given in the following theorem. 

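Theorem 12. There exists an easily constructible family of binary linear codes $C(G; A, B)$ of length $N = n\Delta$, 
$n \to \infty$, and rate $R$ whose relative distance satisfies 

$$\delta(R) \ge \frac{1}{4}(1 - R)^2 \min_{\delta_{GV}((1+R)/2) \le \beta \le 1/2}\frac{g(\beta)}{h(\beta)} - \varepsilon \qquad (\varepsilon > 0), \tag{22}$$

where $g(\beta)$ is defined in (20)-(21). 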

We remark that the bound (22) is very close to (but below) the curve $0.0949(1 - R)^2$, which can be used 
for rough comparison with other bounds in this context. 
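One plausible reading of the constant 0.0949 (an observation added here, not stated in the source) is the following: if $g(\beta)$ in (22) is replaced by the larger function $\lambda(\beta) = \beta/h(\beta)$, the minimum becomes $\min_\beta \beta/h(\beta)^2$, and one quarter of that value is about 0.0949. The sketch below checks this numerically with a ternary search; the function names and the search interval are assumptions of the example.

```python
import math

def h(x):
    """Binary entropy function in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def ratio(beta):
    """lambda(beta) / h(beta) = beta / h(beta)^2."""
    return beta / h(beta) ** 2

def ternary_min(f, lo, hi, iters=200):
    """Locate the minimiser of a function that decreases and then increases."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2.0

beta_star = ternary_min(ratio, 0.05, 0.5)
constant = ratio(beta_star) / 4.0

if __name__ == "__main__":
    print(beta_star, constant)
```

Because the tangent line in (20) lies below $\lambda(\beta)$, the true bound (22) is slightly smaller than this curve, in line with the remark above.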

Note that the overall complexity of finding the codes A, B does not grow with n, so the complexity of 
constructing the code C is proportional to the complexity of constructing the graph G. 
4.2. Modified construction. Let us consider a modified expander code $C(G; A, B, A_{\mathrm{aux}})$ of Section 2.3, 
which should be consulted for notation. We wish to repeat the argument of the previous section. 

The quantity $\gamma$ and the related local parameters are defined as before, but on the subgraph $G_1 = (V_0 \cup V_1, E_1)$. 
For $\beta \in (0, 1)$ we again define $\delta_1(\beta)$ as the minimum relative q-ary weight of any nonzero codeword of $B$ such that the 
average binary weight of its nonzero symbols (edges) is at most $\beta t$. Lemmas 8, 9, 10 hold unchanged. 
The definition of $\delta_0(\beta)$ is somewhat more complicated, however, because the weight of the check edges of 
the code $A$ is now unconstrained. Let us fix an information set $I$ of $R_0\Delta$ q-ary symbols (for definiteness, 
suppose that they occupy the first coordinates of the codeword). Let $\delta_0(\beta)$ be 

• a $\cup$-convex continuous function of $\beta$, 

• a lower bound on the minimum relative binary weight of a codeword of $A$ under the restriction that the 
average binary weight of its nonzero edges in $I$ is at least $\beta t$. 

The following lemma gives an estimate for $\delta_0(\beta)$ that will replace Lemma 11. 



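Lemma 13. For any $\varepsilon > 0$, for $\Delta_1$ and $t$ large enough, for any $R_0$, there exist codes of rate $R_0$ such that, 
for every $\beta$, $\delta_0(\beta) + \varepsilon$ is greater than any convex function that does not exceed $\omega(\beta)$, where $\omega(\beta)$ is the root 
of the equation in $\omega$ 

$$1 - R_0 = \max_{R_0\omega_1 + (1 - R_0)\omega_2 = \omega}\Big[R_0\,\omega_1\,\frac{h(\beta)}{\beta} + (1 - R_0)\,h(\omega_2)\Big],$$

where $\omega$ and $\omega_1$ are constrained by $h^{-1}(1 - R_0) \le \omega \le \beta$ and $\omega_1 \le \beta$. 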

Proof: Let $A$ be a linear code of rate $R_0$ and relative distance $\delta_{GV}(R_0)$. Let $x \in A$ be a codeword of weight 
$w = \omega t\Delta$ such that its average binary weight of nonzero symbols in $I$ is $\beta t$. Denote by $\omega_1$ the proportion of 
nonzero bits among the symbols (edges) of $I$ and let $\omega_2$ be the same for $J = \{1, \dots, \Delta\} \setminus I$. Let us estimate 
the total number $M_w$ of such vectors $x$. As in previous proofs, we take the assumption that each nonzero 
symbol in $I$ is of weight exactly $\beta t$, justified by the fact that the entropy function is $\cap$-convex. Hence the 
number of nonzero symbols in $I$ is 

$$\frac{R_0\,\omega_1\,\Delta t}{\beta t} = \frac{R_0\,\Delta\,\omega_1}{\beta}.$$

With regard to the symbols in $J$ there are no restrictions apart from their total weight. Hence 

$$M_w \approx \max_{\omega_1, \omega_2}\binom{(1 - R_0)\Delta t}{\omega_2(1 - R_0)\Delta t}\binom{t}{\beta t}^{R_0\Delta\omega_1/\beta}.$$

Then, with respect to the ensemble of random linear binary $[\Delta t, R_0\Delta t]$ codes, the probability that $\delta_0(\beta) < \omega$ 
is bounded as 

$$\Pr[w(x) = \omega t\Delta] \le 2^{-\Delta t(1 - R_0)}\,M_w.$$

For any $\omega < \omega(\beta)$ this probability is less than 1, so there exist codes that satisfy the claim of the lemma. ∎ 

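Optimization on $\omega_1$, $\omega_2$ in Lemma 13. We need to maximize the function 

$$F = R_0\,\omega_1\,\frac{h(\beta)}{\beta} + (1 - R_0)\,h(\omega_2)$$

on $\omega_1, \omega_2$ under the condition $R_0\omega_1 + (1 - R_0)\omega_2 = \omega$. The maximum is attained for 

$$\omega_1 = \frac{1}{R_0}\big(\omega - (1 - R_0)\,a(\beta)\big), \qquad \omega_2 = a(\beta),$$

where $a(\beta) = (2^{h(\beta)/\beta} + 1)^{-1}$. Substituting these values into the expression for $F$ and equating the result 
to $1 - R_0$, we find the value of $\omega$: 

$$\omega = \omega^*(\beta) := (1 - R_0)\Big[a(\beta) + \frac{\beta}{h(\beta)}\big(1 - h(a(\beta))\big)\Big].$$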
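The stationary point above can be verified numerically. The Python sketch below is a sanity check under assumed sample values ($\beta = 0.3$, $R_0 = 0.5$), with hypothetical function names: it evaluates $F$ at the claimed maximizer for $\omega = \omega^*(\beta)$, where it should equal $1 - R_0$, and compares it with $F$ along the constraint line $R_0\omega_1 + (1 - R_0)\omega_2 = \omega$.

```python
import math

def h(x):
    """Binary entropy function in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def a(beta):
    """a(beta) = 1 / (2^(h(beta)/beta) + 1)."""
    return 1.0 / (2.0 ** (h(beta) / beta) + 1.0)

def F(w1, w2, beta, r0):
    """Exponent R0 * w1 * h(beta)/beta + (1 - R0) * h(w2)."""
    return r0 * w1 * h(beta) / beta + (1.0 - r0) * h(w2)

def optimal_split(omega, beta, r0):
    """Claimed maximiser of F on the line R0*w1 + (1 - R0)*w2 = omega."""
    w2 = a(beta)
    w1 = (omega - (1.0 - r0) * w2) / r0
    return w1, w2

def omega_star(beta, r0):
    """Value of omega at which the maximum of F equals 1 - R0."""
    return (1.0 - r0) * (a(beta) + beta / h(beta) * (1.0 - h(a(beta))))

if __name__ == "__main__":
    beta, r0 = 0.3, 0.5  # hypothetical sample values
    omega = omega_star(beta, r0)
    w1, w2 = optimal_split(omega, beta, r0)
    print(omega, w1, w2, F(w1, w2, beta, r0))
```

Since $F$ is linear in $\omega_1$ and concave in $\omega_2$, a stationary point on the constraint line is the global maximum there, which is what the grid comparison in the check relies on.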
Next recall that $\omega, \omega_1$ are constrained as follows: $\delta_{GV}(R_0) \le \omega \le \beta$, $\beta \ge \omega_1$. As it turns out, the 
unconstrained maximum computed above contradicts these inequalities for values of $\beta$ close to $\delta_{GV}(R_0)$, 
namely, the value of $\omega$ falls below $\delta_{GV}(R_0)$. Therefore, define $\beta_1$ to be the (only) root of the equation in $\beta$ 

$$\delta_{GV}(R_0) = \omega^*(\beta).$$

For $\beta \in [\delta_{GV}(R_0), \beta_1]$ let us take $\omega_1 = \beta$, $\omega_2 = a(\beta)$. We then use the condition $F = 1 - R_0$ to compute 

$$\omega = \omega^{**}(\beta) := R_0\beta + (1 - R_0)\,h^{-1}\Big(1 - \frac{R_0}{1 - R_0}\,h(\beta)\Big).$$

Concluding, the value of $\omega(\beta)$ in Lemma 13 is given by 

$$\omega(\beta) = \begin{cases}\omega^{**}(\beta), & \delta_{GV}(R_0) \le \beta \le \beta_1,\\ \omega^{*}(\beta), & \beta_1 \le \beta \le 1/2.\end{cases}$$



By definition, the relative distance $\delta_0(\beta)$ is bounded below by any convex function that does not exceed 
$\omega(\beta)$. The function $\omega(\beta)$ consists of two pieces, of which $\omega^{**}$ is a convex function but $\omega^{*}$ is not. We then 
repeat the same argument as was given after Lemma 11, replacing $\omega^{*}(\beta)$ with a tangent to $\omega^{**}(\beta)$ drawn 
from the point $(1/2, \omega^{*}(1/2))$. This finally gives the sought bound on the function $\delta_0(\beta)$. We wish to spare 
the reader the details. 

The overall distance estimate follows from Lemmas 8, 10, 9 and the expression for $\delta_0(\beta)$ found above. 
As before, $\beta$ can be limited to the interval $(\delta_{GV}(R_0), 1/2)$. There is one essential difference compared with 
the previous section: the rates $R_0$, $R_1$ of the component codes are constrained by the rate relation of the modified 
construction rather than that of the basic construction. Since the contribution of the auxiliary code $A_{\mathrm{aux}}$ to the rate 
is small, essentially we have $R = R_0R_1$. Thus we obtain the following result. 

Theorem 14. There exists an easily constructible family of binary linear codes $C(G; A, B)$ of length $N = n\Delta$, 
$n \to \infty$, whose relative distance satisfies 

$$\delta(R) \ge \max_{R \le R_0 \le 1}\ \min_{\delta_{GV}(R_0) \le \beta \le 1/2}\Big\{\delta_0(\beta, R_0)\,\frac{1 - R/R_0}{h(\beta)}\Big\}. \tag{23}$$

Let us compare the distance estimates derived in this theorem and in (22) with the product bound 
$\delta_{GV}(R_0)\delta_{GV}(R_1)$ on the code distance. Taking account of the fact that the code $B$ is over the alphabet 
of size $q = 2^t$ and that for sufficiently large $t$, the bound $\delta_{GV}(R_1)$ comes arbitrarily close to $1 - R_1$, we 
obtain the (Zyablov) bound 

$$\delta_Z(R) \ge \max_{R \le R_0 \le 1}\delta_{GV}(R_0)(1 - R/R_0).$$

In the table below we compute this bound and the new results obtained in this paper. The bounds are also 
shown in Figure 5. 

R                     0.1    0.2    0.3    0.4    0.5    0.6    0.7     0.8     0.9 

$\delta_Z(R)$         0.129  0.073  0.044  0.026  0.015  0.008  0.0040  0.0015  0.00030 
improved bound (22)   0.077  0.061  0.046  0.034  0.024  0.015  0.0084  0.0037  0.00089 
improved bound (23)   0.148  0.095  0.063  0.041  0.026  0.015  0.0078  0.0031  0.00073 
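The Zyablov row of the table can be reproduced approximately with a few lines of Python. The sketch below is an illustration only: it maximizes $\delta_{GV}(R_0)(1 - R/R_0)$ over a uniform grid of values of $R_0$, and the function names, grid size and bisection depth are assumptions of this example rather than the procedure used in the paper.

```python
import math

def h(x):
    """Binary entropy function in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def delta_gv(rate):
    """Root of h(d) = 1 - rate on [0, 1/2], by bisection."""
    lo, hi = 0.0, 0.5
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if h(mid) < 1.0 - rate:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def zyablov_distance(rate, steps=2000):
    """Grid maximisation of delta_GV(R0) * (1 - R/R0) over R < R0 < 1."""
    best = 0.0
    for k in range(1, steps):
        r0 = rate + (1.0 - rate) * k / steps
        best = max(best, delta_gv(r0) * (1.0 - rate / r0))
    return best

if __name__ == "__main__":
    for k in range(1, 10):
        rate = k / 10.0
        print(rate, zyablov_distance(rate))
```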

It is interesting that an improvement of the product bound is obtained already with the basic construction 
of Sect. 4.1. Moreover, for large rates this construction gives codes with a distance larger than that of the 
modified bipartite-graph construction of Sect. 4.2. On the other hand, the modified construction asymptotically 
improves the product bound for all values of the code rate other than 0 and 1. Both code families are 
polynomially constructible. 

Concluding this section we remark that, in principle, the techniques presented here will generalize to yield 
an improvement of the bound [6] for codes from hypergraphs. 
FIGURE 5. Parameters of expander codes: (a) Bound ©, (b) Bound (22), (c) product 
(Zyablov) bound, (d) Bound CT. 



5. Families of asymptotically good binary codes and their construction complexity 

Here we compare the parameters and construction complexity of other asymptotically good families of 
binary codes with the families of bipartite-graph codes presented in the previous section. 

First note that the complexity of specifying the bipartite-graph codes is proportional to the complexity of 
describing the graph $G$, which is $O(N \log N)$ if the graph is represented by a permutation of the $N$ vertices 
in any one part. For most Ramanujan graphs, constructing this permutation has complexity not more than 
$O(N \log N)$. The complexity of constructing codes meeting the Zyablov bound in the traditional way is 
at most $O(N^2)$, by a combination of the results in [24], [14]. 

Codes of distance greater than that given by the Zyablov bound can be constructed as multilevel con- 
catenations or as concatenations of good binary codes of relatively small length with algebraic geometry 
codes from asymptotically maximal curves [12]. The parameters of multilevel concatenations of order m 
are given by the following Blokh-Zyablov bounds [7]: 

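$$R^{(m)}(\delta) = \max_{R_0 \le 1 - h(\delta)}\Big\{R_0 - \frac{\delta R_0}{m}\sum_{i=1}^{m}\Big[\delta_{GV}\Big(R_0\,\frac{m - i + 1}{m}\Big)\Big]^{-1}\Big\}, \qquad m = 1, 2, \dots$$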

(in this case it is more convenient to specify the code rate $R$ for a fixed value of the relative distance 
$\delta$). Note that for $m = 1$ we again obtain the Zyablov bound. The value $R^{(m)}(\delta)$ increases monotonically with $m$ for 
any $0 < \delta < 1/2$ and thus these codes surpass the Zyablov bound for all $m \ge 2$. Their construction 
complexity is 0(N ma,x ( 2 '( m / - R )~ 1 ) ) which is higher than the complexity of constructing the bipartite-graph 
codes, particularly for low code rates. The bound $R^{(m)}(\delta)$ is better than the value of the rate obtained for 
bipartite-graph codes beginning with $m = 4$ or so. 

The largest code rate of multilevel concatenations is obtained by letting $m \to \infty$. The resulting bound is 
given by 

$$R^{(\infty)}(\delta) = 1 - h(\delta) - \delta\int_0^{1 - h(\delta)}\frac{dx}{\delta_{GV}(x)}$$

(this expression is called the Blokh-Zyablov bound). 
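The integral in the Blokh-Zyablov bound has no closed form but is easy to approximate. The following Python sketch uses a midpoint rule; the function names and the number of integration steps are assumptions of this illustration. Since $\delta_{GV}(x) \ge \delta$ on the integration range, the computed rate lies strictly between 0 and the Gilbert-Varshamov rate $1 - h(\delta)$.

```python
import math

def h(x):
    """Binary entropy function in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def delta_gv(rate):
    """Root of h(d) = 1 - rate on [0, 1/2], by bisection."""
    lo, hi = 0.0, 0.5
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if h(mid) < 1.0 - rate:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def blokh_zyablov_rate(delta, steps=2000):
    """1 - h(delta) - delta * integral_0^{1-h(delta)} dx / delta_GV(x), midpoint rule."""
    upper = 1.0 - h(delta)
    width = upper / steps
    integral = sum(width / delta_gv((k + 0.5) * width) for k in range(steps))
    return upper - delta * integral

if __name__ == "__main__":
    for delta in (0.05, 0.1, 0.2):
        print(delta, blokh_zyablov_rate(delta), 1.0 - h(delta))
```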

Concatenations of algebraic-geometry codes with short binary codes introduced in [12] improve the 
Blokh-Zyablov bound $R^{(\infty)}$ for all $0 < \delta < 1/2$ (but do not meet the GV bound on the ensemble-average 
distance of concatenated codes). The code rate of these codes is the largest known asymptotically (for a 
given $\delta$) among families of binary codes with polynomial construction complexity. The construction 
complexity of this code family is $O(N^3 \log^3 N)$ by a recent result of K. Shum et al. [15]. 

Perhaps most important is the fact that even though the two families mentioned above have better 
parameters than bipartite-graph codes, their designed distance in both cases is estimated by the product 
bound. In contrast, the code families constructed in this paper provide an asymptotic improvement of 
the product bound on the distance. 